Handwritten text localization in skewed documents
Abstract
In this paper, a system for handwritten text localization in document images is proposed. Our system performs skew angle correction using the Wigner-Ville distribution and localizes the handwritten areas of the document based on several measures of regularity in shape (self-correlation, horizontal and vertical symmetry) and in dimensions (aspect ratio, distribution of heights and widths). The proposed technique was tested on a variety of documents and successfully handled more than 88% of the set, while in each of the remaining documents the number of misclassified areas did not exceed six.
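To make the shape and dimension measures named above more concrete, the following minimal sketch computes an aspect ratio and two mirror-symmetry scores for a small binary patch. It is an illustration only: the function names, the pixel-matching definition of symmetry, and the list-of-lists patch representation are assumptions made here, not the definitions used by the authors.

```python
# Illustrative sketch only: these definitions are assumptions,
# not the measures defined in the paper.

def aspect_ratio(patch):
    """Width divided by height of a rectangular binary patch."""
    return len(patch[0]) / len(patch)


def _match_fraction(patch, mirrored):
    """Fraction of pixels that are equal in the patch and its mirror image."""
    total = len(patch) * len(patch[0])
    matches = sum(
        a == b
        for row, mirrored_row in zip(patch, mirrored)
        for a, b in zip(row, mirrored_row)
    )
    return matches / total


def left_right_symmetry(patch):
    """Score in [0, 1]: 1.0 means the patch equals its left-right mirror."""
    return _match_fraction(patch, [row[::-1] for row in patch])


def top_bottom_symmetry(patch):
    """Score in [0, 1]: 1.0 means the patch equals its top-bottom mirror."""
    return _match_fraction(patch, patch[::-1])


# A 3x3 "T"-like shape: symmetric left-right, not top-bottom.
t_shape = [
    [1, 1, 1],
    [0, 1, 0],
    [0, 1, 0],
]

print(aspect_ratio(t_shape))
print(left_right_symmetry(t_shape))
print(top_bottom_symmetry(t_shape))
```

Under these assumed definitions, printed characters would tend to give more consistent scores across a text region than handwritten ones, which is the kind of regularity the abstract refers to.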
Similar resources
Handwritten Text Line Segmentation by Clustering with Distance Metric Learning
Separating text lines in handwritten documents remains a challenge because the text lines are often non-uniformly skewed and curved. In this paper, we propose a novel text line segmentation algorithm based on Minimal Spanning Tree (MST) clustering with distance metric learning. Given a distance metric, the connected components of the document image are grouped into a tree structure. Text lines are ex...
Connected Component Based Word Spotting on Persian Handwritten image documents
Word spotting makes unindexed image documents searchable by locating a word or words in a document image, given a query word. This problem is challenging, mainly due to the large number of word classes with very small inter-class and substantial intra-class distances. In this paper, a segmentation-based word spotting method is presented for multi-writer Persian handwritten doc-...
Segmentation of Touching, Overlapping, Skewed and Short Handwritten Text Lines
Text line segmentation is an inherent part of a document recognition system and an important preprocessing step for word and character segmentation. The presence of touching or overlapping text lines, short lines, curvilinear or skewed lines, and small or variable gaps between the text lines makes the segmentation challenging. These variations cause errors in the recognition phase. This paper describes the top...
A new scheme for unconstrained handwritten text-line segmentation
Variations in inter-line gaps and skewed or curled text-lines are some of the challenging issues in segmentation of handwritten text-lines. Moreover, overlapping and touching text-lines that frequently appear in unconstrained handwritten text documents significantly increase segmentation complexities. In this paper, we propose a novel approach for unconstrained handwritten text-line segmentatio...